-
Notifications
You must be signed in to change notification settings - Fork 13.3k
Safer implementation of RepeatN #130887
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Safer implementation of RepeatN #130887
Conversation
This comment has been minimized.
This comment has been minimized.
huh, I'm no llvm expert, buuut...
The other functions ( |
Maybe you can ping scottmcm (test author) to see if the change in codegen result is acceptable (then adjust the test.) |
@Soveu any updates on the failed test? thanks |
It is pretty neat, and in fact it's the implementation I originally wrote for this. The biggest issue I had with it then, though, is that it makes the iterator be potentially-undef, but because of the #130141 bug (which reminds me to go push on rust-lang/rfcs#3712 ) we're stuck with that problem anyway, so there's probably a way you could make this work. Looks like the problem is that it's having trouble figuring out that the store of zero and the store of count-1 can be merged, because otherwise there'd be no difference. Basically, what this test is testing is that for Nothing immediately jumps to mind for how to get LLVM to know that, though. Maybe try a separate I'd suggest copying the implementation to godbolt and playing around a bit there until you find an incantation that works. |
I have tried some things when first writing this code, with no effect, but the idea with fn next(&mut self) -> Option<A> {
let inner = self.inner.as_mut()?;
let count = inner.count.get();
if let Some(new_count) = NonZero::<usize>::new(count - 1) {
let tmp = inner.element.clone();
inner.count = new_count;
return Some(tmp);
}
return if core::mem::needs_drop::<A>() {
self.take_element()
} else {
let tmp = inner.element.clone();
self.inner = None;
Some(tmp)
};
} Godbolt: https://rust.godbolt.org/z/Kr198x635 LLVM output:
LLVM output for the current std implementation:
From what I see, |
That is interesting, just switching variables made it work 🤔 fn next(&mut self) -> Option<A> {
let inner = self.inner.as_mut()?;
let count = inner.count.get();
if let Some(new_count) = NonZero::<usize>::new(count - 1) {
// This must be in this order!
let tmp = inner.element.clone();
inner.count = new_count;
return Some(tmp);
}
return self.take_element();
} |
Definitely odd, but that's why we have codegen tests :P |
This comment has been minimized.
This comment has been minimized.
What that means is that it ended up getting a different layout -- originally it was storing the count at address, but apparently it flipped the order. I don't have a strong feeling about which order is better, but if you want you could put |
…epeatn, r=<try> Use `iter::repeat_n` to implement `Vec::extend_with` This replaces the `Vec::extend_with` manual implementation, which is used by `Vec::resize` and `Vec` `SpecFromElem`, with `iter::repeat_n`. I've compared the codegen output between: 1. the current `Vec::resize` impl 2. this branch 3. this branch + rust-lang#130887 3 gives the closest codegen output to 1, with some output improvements. 2 doesn't look good: https://rust.godbolt.org/z/Yrc83EhjY. May also help rust-lang#120050?
@rustbot label +S-waiting-on-review -S-waiting-on-author |
…epeatn, r=<try> Use `iter::repeat_n` to implement `Vec::extend_with` This replaces the `Vec::extend_with` manual implementation, which is used by `Vec::resize` and `Vec` `SpecFromElem`, with `iter::repeat_n`. I've compared the codegen output between: 1. the current `Vec::resize` impl 2. this branch 3. this branch + rust-lang#130887 3 gives the closest codegen output to 1, with some output improvements. 2 doesn't look good: https://rust.godbolt.org/z/Yrc83EhjY. May also help rust-lang#120050? --- WARNING: DO NOT MERGE - in order to run the perf run in rust-lang#133662 (comment) this PR currently also contains commits from rust-lang#130887
FYI we're also playing around with this change at #133662 since Compiler Explorer showed that the two MRs combined made |
Nice, this reminds me of my article on Rust Magazine: VecDeque::resize() optimization |
Any news? Also #138833 may create a merge conflict with this. |
r? @scottmcm |
} | ||
|
||
#[derive(Clone, Copy)] | ||
#[repr(C)] // keep the layout consistent for codegen tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I think I have you bad advice here in the past.
Rather than a repr
here, I've since learned that you should instead add
//@ needs-deterministic-layouts
to any of the codegen tests that break under layout randomization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks basically good to me, though I did see one thing that I put above.
Can you please rebase to resolve the merge conflict, then mark this ready again?
@rustbot author
Reminder, once the PR becomes ready for a review, use |
I've seen the "Use MaybeUninit for RepeatN" commit while reading This Week In Rust and immediately thought about something I've written some time ago - https://github.com/Soveu/repeat_finite/blob/master/src/lib.rs.
Using the fact, that
Option
will find niche in(T, NonZeroUsize)
, we can construct something that has the same size as(T, usize)
while completely getting rid ofMaybeUninit
.This leaves only
unsafe
onTrustedLen
, which is pretty neat.